Reformulating Prosodic Break Model into Segmental HMMs and Information Fusion
Abstract
This paper presents a method for prosodic break modelling in speech synthesis, based on segmental HMMs and Dempster-Shafer fusion, and assesses the relative importance of linguistic and metric constraints in prosodic break modelling. A context-dependent segmental HMM is used to model the linguistic and metric constraints explicitly. Dempster-Shafer fusion is used to balance the relative importance of the linguistic and metric constraints within the segmental HMM. In addition, a linguistic processing chain based on surface and deep syntactic parsing is used to extract linguistic information of different kinds. An objective evaluation provided evidence that the optimal combination of the linguistic and metric constraints significantly outperforms both the conventional HMM (linguistic information only) and the segmental HMM (equal balance of linguistic and metric constraints), and confirmed that the linguistic constraint takes precedence over the metric constraint.
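As a rough illustration of how Dempster-Shafer fusion can weight two information sources, the sketch below combines two belief mass functions over a break / no-break decision. It is not the paper's implementation: the frame `FRAME`, the reliability values, the example masses, and the use of Shafer discounting to express relative importance are all assumptions made for this example.

```python
from itertools import product

# Hypothetical two-class frame of discernment for a word juncture.
FRAME = frozenset({"break", "no_break"})
B = frozenset({"break"})
N = frozenset({"no_break"})


def discount(mass, reliability):
    """Shafer discounting: scale a source's masses by its reliability and
    move the remaining mass to the whole frame (ignorance)."""
    out = {focal: reliability * value
           for focal, value in mass.items() if focal != FRAME}
    out[FRAME] = 1.0 - reliability + reliability * mass.get(FRAME, 0.0)
    return out


def combine(m1, m2):
    """Dempster's rule of combination for two mass functions."""
    joint = {}
    conflict = 0.0
    for (a, va), (b, vb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            joint[inter] = joint.get(inter, 0.0) + va * vb
        else:
            conflict += va * vb  # mass assigned to contradictory hypotheses
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict")
    return {focal: value / (1.0 - conflict) for focal, value in joint.items()}


# Illustrative masses only (assumed, not taken from the paper).
linguistic = {B: 0.7, N: 0.3}
metric = {B: 0.4, N: 0.6}

# Assumed reliabilities: the linguistic source is trusted more than the metric one.
fused = combine(discount(linguistic, 0.9), discount(metric, 0.5))
```

Lowering a source's reliability moves more of its mass to the full frame, so it has less influence on the fused decision. This is one common way to express an unequal balance between two sources.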
Similar resources
Automatic segmental and prosodic labeling of Mandarin speech database
In this paper we describe the techniques and methodology developed for automatic labeling of segmental and prosodic information for the Mandarin speech database. There are two major procedures. First, the text is converted into the phonetic network of possible pronunciations, and this network is aligned with the speech data by recognition processes. Secondly, many acoustic prosodic features are...
Noise Robust Speech Recognition Using Prosodic Information
This paper proposes a noise robust speech recognition method for Japanese utterances using prosodic information. In Japanese, the fundamental frequency (F0) contour conveys phrase intonation and word accent information. Consequently, it also conveys information about prosodic phrase and word boundaries. This paper first proposes a noise robust F0 extraction method using the Hough transform, whi...
Noise Robust Speech Recognitio Extracted by Hough Tr
This paper proposes a noise robust speech recognition method using prosodic information. In Japanese, fundamental frequency (F0) contour represents phrase intonation and word accent information. Consequently, it conveys information about prosodic phrase and word boundaries. This paper first proposes a noise robust F0 extraction method using Hough transform, which achieves high extraction rates ...
Noise robust speech recognition using F0 contour extracted by hough transform
This paper proposes a noise robust speech recognition method using prosodic information. In Japanese, fundamental frequency (F0) contour represents phrase intonation and word accent information. Consequently, it conveys information about prosodic phrase and word boundaries. This paper first proposes a noise robust F0 extraction method using Hough transform, which achieves high extraction rates ...
Chinese dialect identification using segmental and prosodic features.
Several approaches to Chinese dialect identification based on segmental and prosodic features of speech are described in this paper. When using segmental information only, the system performs phonotactic analysis after speech utterances have been tokenized into sequences of broad phonetic classes. The second scheme comprises prosodic models which are trained to capture tone sequence information...